Partial-Observation Stochastic Games: How to Win when Belief Fails
Authors
Abstract
In two-player finite-state stochastic games of partial observation on graphs, in every state of the graph the players simultaneously choose an action, and their joint actions determine a probability distribution over the successor states. The game is played for infinitely many rounds and thus the players construct an infinite path in the graph. We consider reachability objectives where the first player tries to ensure that a target state is visited almost surely (i.e., with probability 1) or positively (i.e., with positive probability), no matter the strategy of the second player. We classify such games according to the information and to the power of randomization available to the players. On the basis of information, the game can be one-sided, with either (a) player 1 or (b) player 2 having partial observation (and the other player having perfect observation), or two-sided, with (c) both players having partial observation. On the basis of randomization, (a) the players may not be allowed to use randomization (pure strategies), or (b) they may choose a probability distribution over actions but the actual random choice is external and not visible to the player (actions invisible), or (c) they may use full randomization.

Our main results for pure strategies are as follows: (1) For one-sided games with player 2 having perfect observation, we show that (in contrast to fully randomized strategies) belief-based (subset-construction based) strategies are not sufficient, and we present an exponential upper bound on memory both for almost-sure and positive winning strategies; we show that the problem of deciding the existence of almost-sure and positive winning strategies for player 1 is EXPTIME-complete, and we present symbolic algorithms that avoid the explicit exponential construction. (2) For one-sided games with player 1 having perfect observation, we show that non-elementary memory is both necessary and sufficient for both almost-sure and positive winning strategies. (3) For the general (two-sided) case, we show that finite-memory strategies are sufficient for both positive and almost-sure winning, and that at least non-elementary memory is required. We establish the equivalence of the almost-sure winning problems for pure strategies and for randomized strategies with actions invisible. Our equivalence result exhibits serious flaws in previous results in the literature: we show a non-elementary memory lower bound for almost-sure winning, whereas an exponential upper bound was previously claimed.
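The abstract's central negative result for pure strategies is that belief-based (subset-construction based) strategies are not sufficient. To make the notion of "belief" concrete, the following is a minimal Python sketch, under assumed names (State, Action, delta, obs, belief_update are illustrative, not taken from the paper), of the standard subset-construction update for a one-sided game in which player 1 has partial observation: after playing an action and receiving an observation, the new belief is the set of states reachable with positive probability under some player-2 action and consistent with that observation.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Illustrative sketch of the belief (subset-construction) update for a
# one-sided partial-observation stochastic game.  All names and encodings
# below are assumptions for the example, not the paper's construction.

State = str
Action = str

def belief_update(
    belief: FrozenSet[State],
    a1: Action,                  # action chosen by player 1
    observation: str,            # observation received by player 1
    actions2: Set[Action],       # actions available to player 2
    delta: Dict[Tuple[State, Action, Action], Dict[State, float]],
    obs: Dict[State, str],       # observation attached to each state
) -> FrozenSet[State]:
    """Set of states the game can be in after playing a1 from some state in
    `belief`, against any player-2 action, keeping only positive-probability
    successors consistent with the received observation."""
    successors = set()
    for s in belief:
        for a2 in actions2:
            for t, p in delta.get((s, a1, a2), {}).items():
                if p > 0 and obs[t] == observation:
                    successors.add(t)
    return frozenset(successors)

# Tiny example: an unobservable probabilistic branch.
delta = {("s0", "go", "x"): {"s1": 0.5, "s2": 0.5}}
obs = {"s0": "o0", "s1": "o1", "s2": "o1"}
b = belief_update(frozenset({"s0"}), "go", "o1", {"x"}, delta, obs)
print(b)  # {'s1', 's2'}: the belief cannot distinguish s1 from s2
```

The example shows why the belief can be coarse: s1 and s2 yield the same observation, so they collapse into one belief set. The paper's title refers to the fact that, for pure strategies, basing decisions only on this set is not enough; the non-elementary lower bounds stated above show that winning strategies may need far more memory than the belief set provides.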